Weicai Ye

VINO: A Unified Visual Generator with Interleaved OmniModal Context

Jan 05, 2026

In-Context Audio Control of Video Diffusion Transformers

Dec 21, 2025

Kling-Omni Technical Report

Dec 18, 2025

FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers

Jun 05, 2025

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Mar 31, 2025

SketchVideo: Sketch-based Video Generation and Editing

Mar 30, 2025

FullDiT: Multi-Task Video Generative Foundation Model with Full Attention

Mar 25, 2025

CoSurfGS: Collaborative 3D Surface Gaussian Splatting with Distributed Learning for Large Scene Reconstruction

Dec 23, 2024

LLaVA-SLT: Visual Language Tuning for Sign Language Translation

Dec 21, 2024

Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images

Dec 09, 2024